Overview

This parameterized R Markdown file is a standardized exploratory analysis report for a data.frame object. It examines the data types, structure, missing values, unique values, and summary statistics of the continuous and categorical variables the object contains.

The report does not yet support date-time variables such as POSIXct or Date. Support for all data types typically found in an R data.frame is planned.
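Because date-time columns are not handled, it can help to check for them before running the report. The following is a minimal sketch in base R, not part of the report itself; the helper name `unsupported_columns` and the made-up `example_df` are assumptions introduced for illustration.

```r
# Hypothetical helper: return the names of columns that inherit from
# Date or POSIXct, the classes named above as unsupported.
unsupported_columns <- function(df) {
  is_datetime <- vapply(
    df,
    function(x) inherits(x, c("Date", "POSIXct")),
    logical(1)
  )
  names(df)[is_datetime]
}

# airquality has no date-time columns, so nothing is flagged.
unsupported_columns(airquality)

# A small made-up data.frame with one Date column.
example_df <- data.frame(
  id   = 1:2,
  when = as.Date(c("2018-07-02", "2018-07-03"))
)
unsupported_columns(example_df)
```

A caller could stop early or drop the flagged columns, depending on how strict the check should be.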

Data Structure

The input airquality is a data.frame with 153 rows and 6 columns. The tables below summarize the dimensions of the object and the class and missing-value count of each variable. The column names are displayed in the code chunk below.

Dataset Dimensions

| Name | Number of Columns | Number of Rows | Number of Elements | Memory Allocation |
|---|---|---|---|---|
| airquality | 6 | 153 | 918 | 5.5 Kb |

Dataset Variable Class and Missing Values

| Variable | Class | Missing |
|---|---|---|
| Ozone | integer | 37 |
| Solar.R | integer | 7 |
| Wind | numeric | 0 |
| Temp | integer | 0 |
| Month | integer | 0 |
| Day | integer | 0 |
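The dimension and class figures above can be reproduced with base R. This is a sketch under the assumption that the report derives them in a similar way; the object names `dims` and `structure_tbl` are illustrative, and the memory figure can vary slightly between R versions and platforms.

```r
# airquality ships with base R's datasets package.
df <- airquality

# One-row summary of the object's dimensions and memory footprint.
dims <- data.frame(
  name       = "airquality",
  n_cols     = ncol(df),
  n_rows     = nrow(df),
  n_elements = nrow(df) * ncol(df),
  memory     = format(object.size(df), units = "Kb")
)

# Per-variable class and count of missing values (NA).
structure_tbl <- data.frame(
  variable  = names(df),
  class     = vapply(df, function(x) class(x)[1], character(1)),
  missing   = vapply(df, function(x) sum(is.na(x)), integer(1)),
  row.names = NULL
)

print(dims)
print(structure_tbl)
```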

Missing Values, Data Type, Unique Values

Table

The table below displays the data type, the count and proportion of missing values (NA), and whether each variable is labelled.
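As a rough illustration of how such a table might be assembled, the sketch below computes the same kinds of quantities in base R. It is not the report's own code. The column names are made up, and treating "labelled" as "has a `label` attribute" is an assumption.

```r
df <- airquality

missing_tbl <- data.frame(
  variable    = names(df),
  type        = vapply(df, function(x) class(x)[1], character(1)),
  n_missing   = vapply(df, function(x) sum(is.na(x)), integer(1)),
  # Share of rows that are NA, as a percentage rounded to one decimal.
  pct_missing = round(100 * vapply(df, function(x) mean(is.na(x)), numeric(1)), 1),
  # Distinct values, not counting NA.
  n_unique    = vapply(df, function(x) length(unique(x[!is.na(x)])), integer(1)),
  # Assumption: a variable counts as labelled if it carries a "label" attribute.
  has_label   = vapply(df, function(x) !is.null(attr(x, "label")), logical(1)),
  row.names   = NULL
)

print(missing_tbl)
```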

Plot

Unique Values

Variable Content

This section analyzes the content of each variable by describing what kind of values it holds: numeric values, character values, punctuation or symbols, and blank space. This is especially useful when data types are unidentified or misclassified.

Only Numbers

only_numbs reports the number of rows that contain only digits "^[0-9]{1,}$" and numpercent is the respective percentage. digits_min and digits_max display the minimum and maximum length of the digit strings found in the variable, and digits_eq is a logical statement of whether the min and max are equal.
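The digit check can be illustrated on a small made-up vector. This is a sketch rather than the report's implementation; in particular, using the full vector length (including NA) as the percentage denominator is an assumption.

```r
# Made-up values standing in for one column read as character.
x <- c("123", "45", "12a", "7.5", "", NA)

# grepl() returns FALSE for NA, so missing values are never counted.
is_digits  <- grepl("^[0-9]{1,}$", x)
only_numbs <- sum(is_digits)
numpercent <- 100 * only_numbs / length(x)

# Length of each all-digit value, then the range of those lengths.
digit_len  <- nchar(x[is_digits])
digits_min <- min(digit_len)
digits_max <- max(digit_len)
digits_eq  <- digits_min == digits_max
```

Note that "7.5" is not counted, because the pattern allows digits only and the decimal point fails the match.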

Only Character

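only_char reports the number of rows that contain only characters "^[A-z]{1,}" and charpercent is the respective percentage. As written, the pattern has no closing $ anchor, so it matches any value that begins with a letter.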

Only Punctuation

only_punc reports the number of rows that contain only punctuation or symbols "^\\W+$" and puncpercentage is the respective percentage.

Only Blanks

only_blanks reports the number of rows that contain an empty string ("") "^\\s{0}$" and blankspercent is the respective percentage. only_wspace reports the number of rows that contain only white-space, including the empty string, "^\\s*$" and wspacepercent is the respective percentage.
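The character, punctuation, and blank patterns can be tried on a made-up vector to see what each one counts. This sketch is not the report's code. It passes `perl = TRUE` so that character ranges are interpreted by code point, which is an assumption about the regex engine, and the results may differ under another engine or locale.

```r
# Made-up values covering letters, mixed content, symbols, and blanks.
x <- c("abc", "abc1", "_x", "!!", "", "  ", "12")

# No end anchor: any value starting with a character in A-z matches.
# By code point, A-z also spans [ \ ] ^ _ and `, so "_x" matches too.
only_char <- sum(grepl("^[A-z]{1,}", x, perl = TRUE))

# \W is "not a word character", which includes spaces, so "  " matches.
only_punc <- sum(grepl("^\\W+$", x, perl = TRUE))

# Zero whitespace characters between the anchors: the empty string only.
only_blanks <- sum(grepl("^\\s{0}$", x, perl = TRUE))

# Zero or more whitespace characters: the empty string and "  ".
only_wspace <- sum(grepl("^\\s*$", x, perl = TRUE))

# Percentages relative to the number of rows (an assumed denominator).
charpercent <- 100 * only_char / length(x)
```

Because these patterns overlap, a whitespace-only value is counted under both only_punc and only_wspace, so the categories should not be expected to sum to the row count.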

All Types

Variable Content Plot

Generates an interactive plotly bar plot of variable content by raw counts.

Continuous Variables

‘Ozone’

Summary Statistics for Ozone

| Min | Mean | Median | Max | Standard Deviation |
|---|---|---|---|---|
| 1 | 42.13 | 31.5 | 168 | 32.99 |

‘Solar.R’

Summary Statistics for Solar.R

| Min | Mean | Median | Max | Standard Deviation |
|---|---|---|---|---|
| 7 | 185.93 | 205 | 334 | 90.06 |

‘Wind’

Summary Statistics for Wind

| Min | Mean | Median | Max | Standard Deviation |
|---|---|---|---|---|
| 1.7 | 9.96 | 9.7 | 20.7 | 3.52 |

‘Temp’

Summary Statistics for Temp

| Min | Mean | Median | Max | Standard Deviation |
|---|---|---|---|---|
| 56 | 77.88 | 79 | 97 | 9.47 |

‘Month’

Summary Statistics for Month

| Min | Mean | Median | Max | Standard Deviation |
|---|---|---|---|---|
| 5 | 6.99 | 7 | 9 | 1.42 |

‘Day’

Summary Statistics for Day

| Min | Mean | Median | Max | Standard Deviation |
|---|---|---|---|---|
| 1 | 15.8 | 16 | 31 | 8.86 |
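The summary statistics for the continuous variables can be reproduced in base R. Ozone and Solar.R contain missing values yet report a mean, which is consistent with NAs being dropped before each statistic is computed. The helper name `summarise_continuous` is illustrative and not taken from the report.

```r
# Hypothetical helper: five summary statistics with missing values removed.
summarise_continuous <- function(x) {
  c(
    min    = min(x, na.rm = TRUE),
    mean   = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    max    = max(x, na.rm = TRUE),
    sd     = sd(x, na.rm = TRUE)
  )
}

# One row per variable, one column per statistic, rounded to two decimals.
stats_tbl <- round(t(vapply(airquality, summarise_continuous, numeric(5))), 2)
print(stats_tbl)
```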


R Session Info

## R version 3.5.1 (2018-07-02)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: OS X El Capitan 10.11.6
## 
## Matrix products: default
## BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] grid      stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] plotly_4.8.0       bindrcpp_0.2.2     htmlwidgets_1.3   
##  [4] DT_0.4             kableExtra_0.9.0   gridExtra_2.3     
##  [7] ggfortify_0.4.5    scales_1.0.0       ggplot2_3.1.0     
## [10] stringr_1.3.1      forcats_0.3.0      data.table_1.11.4 
## [13] purrr_0.2.5        tidyr_0.8.1        haven_1.1.2       
## [16] telegram.bot_2.2.0 dplyr_0.7.8        rmarkdown_1.11    
## [19] here_0.1          
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.0         utf8_1.1.4         assertthat_0.2.0  
##  [4] rprojroot_1.3-2    digest_0.6.18      mime_0.6          
##  [7] R6_2.3.0           plyr_1.8.4         backports_1.1.2   
## [10] evaluate_0.12      httr_1.4.0         highr_0.7         
## [13] pillar_1.3.1       rlang_0.3.1.9000   lazyeval_0.2.1    
## [16] curl_3.3           rstudioapi_0.9.0   labeling_0.3      
## [19] webshot_0.5.1      readr_1.3.1        munsell_0.5.0     
## [22] shiny_1.2.0        compiler_3.5.1     httpuv_1.4.5.1    
## [25] xfun_0.4           pkgconfig_2.0.2    htmltools_0.3.6   
## [28] tidyselect_0.2.5   tibble_2.0.1       fansi_0.4.0       
## [31] viridisLite_0.3.0  crayon_1.3.4       dbplyr_1.2.2      
## [34] withr_2.1.2        later_0.7.5        jsonlite_1.6      
## [37] xtable_1.8-3       gtable_0.2.0       DBI_1.0.0         
## [40] magrittr_1.5       cli_1.0.1          stringi_1.2.4     
## [43] promises_1.0.1     xml2_1.2.0         RColorBrewer_1.1-2
## [46] tools_3.5.1        Cairo_1.5-9        glue_1.3.0        
## [49] hms_0.4.2          crosstalk_1.0.0    yaml_2.2.0        
## [52] colorspace_1.4-0   rvest_0.3.2        knitr_1.21        
## [55] bindr_0.1.1



Process Time: 0.1 minutes